OPTICS on Text Data: Experiments and Test Results
Abstract
Clustering, and text clustering in particular, has recently attracted considerable attention in data mining. Conventional techniques such as K-means depend on parameters that cannot be easily estimated. Density-based clustering algorithms offer significant advantages and have therefore drawn much of this attention. OPTICS [1] is the latest and most sophisticated technique in this direction, and has been shown to be considerably tolerant of changes in its parameter values. To the best of our knowledge, this is the first study of the applicability of OPTICS to text data. To this end, we perform a variety of experiments using various feature selection techniques (which, as we show, assume greater significance in the context of density-based clustering), quantify and explain our results, and list our conclusions.
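To make the two OPTICS parameters concrete, the following is a minimal sketch of the quantities the algorithm is built on: the core distance of a document and the reachability distance between two documents. It is not the authors' implementation and does not produce the full cluster ordering. The abstract does not say how documents are represented, so the TF-IDF weighting, the cosine distance, the toy documents, and all function names (`tfidf_vectors`, `cosine_distance`, `core_distance`, `reachability_distance`) are assumptions made for illustration, as is the convention that a point counts as its own first neighbour.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    # Clamp to guard against tiny negative values caused by rounding.
    return max(0.0, 1.0 - dot / (norm_a * norm_b))


def core_distance(p, dist, min_pts, eps):
    """Distance from p to its min_pts-th nearest neighbour, or None if that
    neighbour lies farther than eps (p is then not a core point).
    Assumption: p counts as its own first neighbour (distance 0)."""
    nearest = sorted(dist[p])
    if min_pts > len(nearest):
        return None
    kth = nearest[min_pts - 1]
    return kth if kth <= eps else None


def reachability_distance(o, p, dist, min_pts, eps):
    """Reachability of o from p: max(core_distance(p), dist(p, o)),
    or None when p is not a core point."""
    core = core_distance(p, dist, min_pts, eps)
    if core is None:
        return None
    return max(core, dist[p][o])


# Toy corpus (hypothetical): three related documents and one outlier.
docs = [
    "density based clustering of text documents",
    "density based clustering with optics",
    "optics clustering of text data",
    "stock market prices fell sharply today",
]
vectors = tfidf_vectors(docs)
dist = [[cosine_distance(a, b) for b in vectors] for a in vectors]

for p in range(len(docs)):
    print(p, core_distance(p, dist, min_pts=2, eps=0.9))
```

In this toy corpus the last document shares no terms with the others, so its cosine distance to each of them is 1 and it is not a core point for any `eps` below 1. The same sensitivity to which terms are kept is one plausible reason feature selection matters for density-based methods on text.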
Publication date: 2006